Skip to content

[herd] Global count of useful hardware update#1733

Open
maranget wants to merge 5 commits intomasterfrom
global-hwupdates
Open

[herd] Global count of useful hardware update#1733
maranget wants to merge 5 commits intomasterfrom
global-hwupdates

Conversation

@maranget
Copy link
Copy Markdown
Member

@maranget maranget commented Mar 2, 2026

This draft PR is an evolution of PR #1730. The number of useful hardware updates is evaluated by a global scan of events at the very end of execution candidate construction. The scan counts the number of explicit write effects to locations that can be page table entries of values whose AF flag can be zero.

The technique sounds more general and should yield a better upper bound than the local technique of PR #1730. Some points are still to consider :

  • Simple identification of the case of a failing CAS that nevertheless writes the value read from memory into memory again. In such a case, there is no need to count another hardware update. (notice that, in PR [herd] Attempt to limit the number of spurious AF updates #1730, this situation is identifierd by miracle).
  • Totally untested on ASL.

@maranget maranget marked this pull request as ready for review March 30, 2026 08:15
The function [delay_kont] can be used to extract the value returned
by a monad (second argument below, type `'a`).
```
  val delay_kont : string -> 'a t -> ('a ->  'a t -> 'b t) -> 'b t
```

The continuation function `(fun v mv -> ... )` can then examine
the returned value `v` and combine the monad `mv` independently,
which proves very convenient in many occasion.

The change performed by this commit permits affine (_i.e._ one or zero
effective occurrence), while linear usage (exactly one occurrence) was
mandatory before.
  + Efficient group function: sort, then group.
  + Suffix based generators:
     - generate all suffixes,
     - cross product of suffixes.
When TTHM=HA or TTHM=HD are active, HW update of the AF flag
is performed. This include the so-called "spurious" updates
that are performed independently of test code.

For efficiency reason we limit the number of such spurious
updates to what is necessary. We do so by a global
scan of the execution candidates  counting the
writes that may unset the AF flag in the final
set of effects. Notice that we also consider the
initial writes in this scan.

We perform one optimisation: by exception, when a
write effect value is the same as the value
read by the same instruction from the same location,
there is no need to add a supplementary spurious update
as the (potential) update associated to the write that
stored the value has already been counted and is sufficient.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants